
THE UNITED STATES PATENT AND TRADEMARK OFFICE 



1125722-0005 



Applicants 



Miller, et al.



Serial No. 



09/730,214 



Examiner: M. Borin 



Filed 



December 5, 2000 



Group Art Unit: 1631 



For 



METHOD AND SYSTEM FOR DESIGNING PROTEINS AND 
PROTEIN BACKBONE CONFIGURATIONS 



DECLARATION UNDER 37 CFR §1.132 



I, Ned Wingreen, Ph.D. declare as follows: 

I am a Professor in the Department of Molecular Biology at 
Princeton University in Princeton, New Jersey. Immediately 
prior to that, I was Senior Research Staff Member at NEC 
Laboratories America, Inc., Princeton, New Jersey, the Assignee 
of the above-identified application. Currently, I remain in the
employ of NEC on a consultant basis. My curriculum vitae is 
attached as Exhibit A. 

I am a coinventor of the subject matter of the above-identified
patent application, and I participated in the
February 23, 2005 Examiner interview. I am familiar with the
Office Actions issued during the course of prosecution of this
application and its offspring divisional applications. The
studies set forth below were carried out by my collaborators and
me.

The studies were performed to determine the effectiveness 
of the design method claimed in the above-referenced application

NEWYORK 4785360 v2 (2K) 






Serial No.: 09/730,214 
Filed: December 5, 2000 
Docket No. 1125722-0005 



in identifying a novel, stable fold into which a novel sequence 
of amino acids can be configured. 

Using the method of Miller et al., which is also the method disclosed and claimed in the present
application, we identified a small, highly designable protein fold that did not appear as a
stand-alone fold in the Protein Data Bank (PDB). [Miller, et al., Emergence of Highly Designable
Protein-Backbone Conformations in an Off-Lattice Model. Proteins 47, 506-512 (2002), copy
attached as Exhibit B.] For the particular analysis described below, the method consisted of the
following steps: (1) generating backbone configurations of a preselected length n by complete
enumeration using a set of three dihedral angle pairs, (2) assigning a sphere of radius 1.9 Å to the
beta carbon position of each residue, (3) eliminating configurations for which any of these
spheres overlapped, (4) evaluating the surface exposure of each sphere in each remaining
configuration, and eliminating all but the ~10,000 configurations with the lowest total surface
exposure, (5) normalizing the surface exposure of the spheres in each remaining configuration,
(6) generating sequences of hydrophobicities $h_i$ (= 0 or 1) of the same length as each of the
remaining configurations, (7) determining for each sequence of hydrophobicities which of the
remaining configurations was the ground state, (8) identifying those configurations which were
ground states of the largest number of sequences of hydrophobicities, and (9) determining which
of these configurations were novel, i.e., did not have a close match in the PDB.
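
Steps (6) to (8) can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the energy of a hydrophobicity sequence on a configuration is the sum of the exposures of its hydrophobic residues, so that hydrophobic residues favor buried positions; the declaration does not state the energy function. The function name `designability` and the toy exposure vectors (given in integer percent so that ties are exact) are hypothetical.

```python
from itertools import product

def designability(exposures):
    """Count, per configuration, the sequences for which it is the unique ground state.

    exposures: one per-residue exposure vector per configuration.
    Assumed energy (illustrative only): E = sum_i h_i * exposure_i, h_i in {0, 1}.
    """
    n = len(exposures[0])
    counts = [0] * len(exposures)
    for h in product((0, 1), repeat=n):           # step (6): all hydrophobicity sequences
        energies = [sum(hi * ai for hi, ai in zip(h, a)) for a in exposures]
        lowest = min(energies)
        winners = [k for k, e in enumerate(energies) if e == lowest]
        if len(winners) == 1:                     # step (7): unique ground state only
            counts[winners[0]] += 1
    return counts

# Toy input: three hypothetical 4-residue configurations, exposures in percent.
toy = [
    [40, 10, 10, 40],
    [25, 25, 25, 25],
    [10, 40, 40, 10],
]
counts = designability(toy)
# Step (8): rank configurations from most to least designable.
ranking = sorted(range(len(toy)), key=lambda k: counts[k], reverse=True)
```

In this toy case the uniformly exposed configuration is never a unique ground state, which mirrors the qualitative point that configurations differ widely in how many sequences select them.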

By following the above steps we winnowed down the number of protein backbone
configurations which merited consideration for design. First, the very large number of
configurations generated from all possible combinations of the three dihedral angle pairs ($3^n$)
was reduced to ~10,000 by considerations of self-overlap and compactness (see steps (1) to (4)
above). Second, the remaining ~10,000 configurations were organized in a list starting from the
configuration that was the ground state of the largest number of sequences. Configurations not
falling near the top of this list (~ top 100) were considered unpromising for purposes of design.
Finally, the top folds were tested for novelty by comparison with known protein backbone
configurations in the PDB.
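
As a rough illustration of the scale of this winnowing, the following Python sketch evaluates $3^n$ and the fraction of configurations that about 10,000 survivors would represent. The chain length n = 23 is the one reported in Exhibit B; n = 10 is an arbitrary small value included for comparison, and the function name `enumeration_size` is hypothetical.

```python
def enumeration_size(n, m=3):
    """Number of backbone configurations from m dihedral-angle pairs at each of n positions."""
    return m ** n

KEPT = 10_000  # approximate number of configurations retained after steps (1) to (4)

for n in (10, 23):
    total = enumeration_size(n)
    print(f"n={n}: {total:,} configurations; retained fraction ~ {KEPT / total:.1e}")
```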






One fold identified in this way became our target for synthesis. In terms of secondary structural 
elements, the fold consisted of a beta strand followed by an alpha helix, followed by a second 
beta strand. The beta strands folded over the alpha helix creating a two-stranded beta sheet as 
shown in Fig. 1. 



Figure 1. Ribbon diagram of beta-alpha-beta fold. The beta strands (yellow) form a beta sheet on top of the 
alpha helix (magenta). 







A specific amino-acid sequence, of length 33 residues, was designed to adopt the desired 
backbone configuration, based on standard considerations of packing and solvent exposure. The 
designed sequence is KRRTITLGGGEERIKKYREAFKNGNTEVTFQGQ, using the single-letter code 
for amino acids. The predicted configuration of this sequence folded into the beta-alpha-beta 
configuration is shown in Fig. 2. 



Figure 2. Predicted structure of sequence designed to adopt beta-alpha-beta fold. The detailed backbone and 
sidechain configurations for the 33-residue sequence are shown with nitrogens indicated in blue and oxygens in red. 






The designed protein sequence of 33 residues was synthesized chemically and subjected to 
various analyses. First, the protein proved to be highly soluble in water, which allowed for 
standard biophysical tests. Specifically, the circular dichroism (CD) spectrum was obtained and 
analyzed (Fig. 3). The measured spectrum corresponds to an alpha helical content of 28% and a 
beta-strand content of 20%, which compare very favorably with the predicted values of 30% and 
18%, respectively. 




Figure 3. Measured circular dichroism (CD) spectrum of designed 33-residue sequence. The amplitudes of the 
characteristic features in the CD spectrum correspond to a folded structure consisting of 28% alpha helix and 20% 
beta strand. 






The specific heat of thermal denaturation was also measured (Fig. 4) and found to be consistent 
with two-state folding, i.e., a direct transition between an ensemble of unfolded configurations
and a single folded configuration with decreasing temperature. 



[Plot: specific heat versus temperature, T/°C.]



Figure 4. Measured specific heat of thermal denaturation of designed 33-residue sequence. The thermal 
denaturation curve (black) can be fitted extremely well by a theoretical curve corresponding to two-state folding,
consistent with the existence of a single well-folded configuration. 






A 1D-NMR spectrum was obtained for the designed 33-residue sequence in solution (Fig. 5). 
The clear resolution of the peaks provided critical evidence that the designed sequence was 
indeed folding into a single unique structure. The peak widths proved somewhat too broad to 
allow reconstruction of the three-dimensional structure by 2D-NMR. 




[Plot: 1D-NMR spectrum; horizontal axis 1H chemical shift in ppm, from 12 to -3.]

Figure 5. 1D-NMR spectrum of the designed 33-residue sequence. The multiple peaks are consistent with a 
single folded structure. 



Together, the 1D-NMR spectrum and the CD spectrum provide strong evidence that not only is 
the designed sequence folding into a unique and stable structure, but that the unique structure is 
the target beta-alpha-beta fold. 



I hereby declare that all statements made herein of my own 
knowledge are true and that all statements made on information 
and belief are believed to be true and further, that these 
statements were made with the knowledge that willful false 




statements and the like so made are punishable by fine or 
imprisonment, or both, under Section 1001 of Title 18 of the 
United States Code and that such willful false statements may 
jeopardize the validity of the application or any patent issued 




thereon. 




Ned Wingreen, Ph.D. 






Ned S. Wingreen 

Department of Molecular Biology - Princeton University, Princeton, NJ 08544-1014 
Phone: 609-258-8476 Fax: 609-258-8616 Email: wingreen@princeton.edu



EDUCATION 

California Institute of Technology Physics B.S. 1984 

Cornell University Physics M.S. 1988 

Cornell University Physics Ph.D. 1989 

Dissertation: Resonant Tunneling with Electron-Phonon Interaction. 

Thesis adviser: Professor John W. Wilkins. 



PROFESSIONAL EMPLOYMENT 

9/84 - 5/89 Fannie and John Hertz Foundation Fellow, Lab of Atomic and Solid State Physics, Cornell 
University 

5/89 - 9/89 Visiting Scientist, Weizmann Institute of Science, Israel 

9/89 - 9/91 Postdoctoral Associate, Physics Department, MIT, Supervisor: Patrick A. Lee 

9/91 - 3/99 Research Scientist, Physical Sciences Division, NEC Research Institute 

4/99 - 10/02 Senior Research Scientist, Physical Sciences Division, NEC Research Institute 

8/99 - 5/00 Sabbatical Visitor, University of California, Berkeley 

11/02 - 1/04 Senior Research Staff Member, NEC Laboratories America, Inc.

2/04 - Present Professor, Princeton University, Department of Molecular Biology 



HONORS 
Academic: 

California Institute of Technology (1980-1984) 

Presidential Scholar (1980) 

Carnation Merit Scholarship (1982-1983) 

Caltech Merit Scholarship (1983-1984) 

Jack E. Froehlich Memorial Award (1983) 

McKinney Prize in Literature (1984) 
Cornell University (1984-1989) 

Fannie and John Hertz Foundation Fellowship (1984-1989)
Professional: 

Fellow of the American Physical Society 
PATENTS 

U.S. Patent No. 5,963,571, October 5, 1999, "Quantum-Dot Cascade Laser," Ned S. Wingreen

U.S. Patent No. 5,699,215, December 16, 1997, "Non-Magnetic Magnetoresistive Reading Head Using Corbino
Structure," Stuart A. Solin, Ned S. Wingreen

U.S. Patent No. 5,692,003, November 25, 1997, "Quantum-Dot Cascade Laser," Ned S. Wingreen, Charles A. 
Stafford 



Wingreen CV-page 1 



PUBLICATIONS 



• Morten Kloster, Chao Tang, Ned S. Wingreen, 

Finding Regulatory Modules through Large-Scale Gene-Expression Data Analysis, 
Bioinformatics. Oct 28; [Epub ahead of print] (2004). 

• Naigong Zhang, Chen Zeng, Ned S. Wingreen, 

Fast Accurate Evaluation of Protein Solvent Exposure, 

Proteins, 57: (3): pp. 565-76 (2004). 

• Robert G. Endres, Thomas C. Schulthess, Ned S. Wingreen, 

Toward an Atomistic Model for Predicting Transcription-Factor Binding Sites, 
Proteins, 57: (2): pp. 262-8. (2004). 

• Ned S. Wingreen, 

Quantum Many-Body Effects in a Single-Electron Transistor, 
Science, 304: (5675): pp. 1258-9 (2004). 

• Kenji Hirose, Yigal Meir, Ned S. Wingreen, 

Time-Dependent Density Functional Theory of Excitation Energies of Closed-Shell Quantum Dots, 

Physica E 22: (1-3): pp. 486-489 (2004). 

• Derrick H. Lenz, Kenny C. Mok, Brendan N. Lilley, Rahul V. Kulkarni, Ned S. Wingreen, and Bonnie L. 
Bassler, 

The Small RNA Chaperone Hfq and Multiple Small RNAs Control Quorum Sensing in Vibrio harveyi 

and Vibrio cholerae, 
Cell 118: pp. 69-82 (2004).

• Eldon G. Emberly, Ranjan Mukhopadhyay, Chao Tang, Ned S. Wingreen, 

Flexibility of Beta-Sheets: Principal Component Analysis of Database Protein Structures, 

Proteins: Structure, Function, and Genetics 55: pp. 91-98 (2004). 

• Kerwyn Casey Huang, Yigal Meir, Ned S. Wingreen, 

Dynamic Structures in Escherichia coli: Spontaneous Formation of MinE Rings and MinD Polar Zones, 

Proceedings of the National Academy of Science 100:(22), pp. 12724-12728 (2003). 

• Ned S. Wingreen, Hao Li, Chao Tang, 

Designability and Thermal Stability of Protein Structures, 

Polymer 45: (2): pp. 699-705 (2004) 

• Ranjan Mukhopadhyay, Eldon Emberly, Chao Tang, and Ned S. Wingreen, 
Statistical Mechanics of RNA Folding: Importance of Alphabet Size, 
Physical Review E 68: pp. 041904(1-4) (2003). 

• Ned S. Wingreen, Jonathan Miller, Edward C. Cox, 
Scaling of Mutational Effects in Models for Pleiotropy, 
Genetics 164: pp. 1221-28 (2003). 

• Kenji Hirose, Ned S. Wingreen, 

Stabilization of Ground-State of Minimal Spin in Disordered Quantum Dots, 

Physica E 18: (1-3): pp. 79-80 (2003). 

• Eldon Emberly, Ranjan Mukhopadhyay, Ned S. Wingreen, Chao Tang, 

Flexibility of Alpha-Helices: Results of a Statistical Analysis of Database Protein Structures, 

Journal of Molecular Biology 327: pp. 229-37 (2003). 






Kenny C. Mok, Ned S. Wingreen, Bonnie L. Bassler, 

Vibrio harveyi Quorum Sensing: A Coincidence Detector for Two Autoinducers Controls Gene 

Expression, 
EMBO Journal 22: pp. 870-881 (2003). 

Kenji Hirose, Yigal Meir, Ned S. Wingreen, 

Local Moment Formation in Quantum Point Contacts, 

Physical Review Letters 90:(2), pp. 026804(1-4) (2003). 

Eldon Emberly, Ned S. Wingreen, Chao Tang, 
Designability of Alpha-Helical Proteins, 

Proceedings of the National Academy of Science 99: pp. 11163-8 (2002).
Hao Li, Chao Tang, Ned S. Wingreen, 

Designability of Protein Structures: A Lattice-Model Study using the Miyazawa-Jernigan Matrix, 

Proteins: Structure, Function, and Genetics 49: pp. 403-412 (2002). 

Jonathan Miller, Chen Zeng, Ned S. Wingreen, Chao Tang, 

Emergence of Highly Designable Protein-Backbone Conformations in an Off-Lattice Model, 

Proteins: Structure, Function, and Genetics 47: pp. 506-512 (2002). 

Eldon Emberly, Jonathan Miller, Chen Zeng, Ned S. Wingreen, Chao Tang, 
Identifying Proteins of High Designability Via Surface Exposure Patterns, 

Proteins: Structure, Function, and Genetics 47:(3), pp. 295-304 (2002). 

Yigal Meir, Kenji Hirose, Ned S. Wingreen, 

Kondo Model for the 0.7 Anomaly in Transport through a Quantum Point Contact, 

Physical Review Letters 89:(19), pp. 196802(1-4) (2002). 

S. M. Cronenwett, H. J. Lynch, D. Goldhaber-Gordon, L. P. Kouwenhoven, C. M. Marcus, Kenji Hirose, Ned 
S. Wingreen, V. Umansky, 

Low-Temperature Fate of the 0.7 Structure in a Point Contact: a Kondo-Like Correlated State in an 
Open System, 

Physical Review Letters 88:(22), pp. 226805(1-4) (2002). 
Kenji Hirose, Ned S. Wingreen, 

Ground-State Energy and Spin in Disordered Quantum Dots, 

Physical Review B 65:(19), pp. 193305(1-4) (2002). 

Henry Cejtin, Jan Elder, Allan Gottlieb, Robert Helling, Hao Li, James Philbin, Chao Tang, Ned Wingreen, 
Fast Tree Search For Enumeration of a Lattice Model of Protein Folding, 

Journal of Chemical Physics 116:(1), pp. 352-359, (2002). 

Robert Helling, Hao Li, Regis Melin, Jonathan Miller, Ned S. Wingreen, Chen Zeng, Chao Tang, 
The Designability of Protein Structures, 

Journal of Molecular Graphics and Modelling 19:(1), pp. 157-167 (2001). 

Hao Li, Chao Tang, Ned S. Wingreen, 
Designing Protein Structures, 

Phase Transition and Self-Organization in Electronic and Molecular Networks, Phillips, J.C. (ed.) Kluwer pp.
441-445 (2001). 

Kenji Hirose, Shu-Shen Li, Ned S. Wingreen, 

Mechanisms for Extra Conductance Plateaus in Quantum Wires, 

Physical Review B 63:(3), pp. 033315(1-4) (2001). 






Kenji Hirose, Fei Zhou, Ned S. Wingreen, 

Density-Functional Theory of Spin-Polarized Disordered Quantum Dots, 

Physical Review B 63:(7), pp. 075301(1-5) (2001). 

Kenji Hirose, Fei Zhou, Ned S. Wingreen, 

Spin-Density-Functional Theory of Clean and Disordered Quantum Dots, 

Proceedings of the 25th International Conference on the Physics of Semiconductors-ICPS, Miura, N. (ed.),
Springer, pp. 1349-1350 (2001).

Kenji Hirose, Ned S. Wingreen, 

Temperature-Dependent Suppression of Conductance in Quantum Wires: Anomalous Activation Energy 

from Pinning of the Band Edge, 

Physical Review B 64:(7), pp. 073305(1-4) (2001). 

Ned S. Wingreen, 

The Kondo Effect in Novel Systems, 

Materials Science and Engineering B 84:, pp. 22-25 (2001). 

V. Madhavan, W. Chen, T. Jamneala, M.F. Crommie, Ned S. Wingreen, 
Local Spectroscopy of a Kondo Impurity: Co on Au(111),
Physical Review B 64:(16), pp. 165412(1-11) (2001). 

D.E. Grupp, T. Zhang, G.J. Dolan, Ned S. Wingreen, 
Dynamical Offset Charges in Single-Electron Transistors, 

Physical Review Letters 87:(18), pp. 186805(1-4) (2001). 

Tairan Wang, Jonathan Miller, Chao Tang, Ned S. Wingreen, Ken A. Dill, 
Symmetry and Designability for Lattice Protein Models, 

Journal of Chemical Physics 113:(18), pp. 8329-8336 (2000). 

Peter Nordlander, Ned S. Wingreen, Yigal Meir, David C. Langreth, 
Kondo Physics in the Single Electron Transistor with ac Driving, 

Physical Review B 61:(3), pp. 2146-2150 (2000).

Regis Melin, Hao Li, Ned S. Wingreen, Chao Tang, 

Designability, Thermodynamic Stability, and Dynamics in Protein Folding: A Lattice Model Study, 

Journal of Chemical Physics 110: pp. 1252-1262 (1999).

Kenji Hirose, Ned S. Wingreen, 

Spin-Density-Functional Theory of Circular and Elliptical Quantum Dots, 

Physical Review B 59:(7), pp. 4604-4607 (1999). 

Peter Nordlander, Michael Pustilnik, Yigal Meir, Ned S. Wingreen, David C. Langreth, 
How Long Does it Take for the Kondo Effect to Develop? , 
Physical Review Letters 83:(4), pp. 808-811 (1999).

Igor E. Smolyarenko, Ned S. Wingreen, 
Kondo Effect in Systems With Spin Disorder, 

Physical Review B 60:(13), pp. 9675-9689 (1999). 

K. Hirose, N. S. Wingreen, 

Electronic Structure Calculations of Quantum Dots, 

NEC Research and Development 40:(4), pp. 419-423 (1999). 

C. Heide, R. J. Elliott, Ned S. Wingreen, 

Spin-Polarized Tunnel Current in Magnetic-Layer Systems and its Relation to the Interlayer Exchange 






Interaction, 

Physical Review B 59:(6), pp. 4287-4304 (1999). 
P. Jauho, Ned S. Wingreen, 

Theory of Phase-Sensitive Measurement of Photon-Assisted Tunneling Through a Quantum Dot, 

Physical Review B 58:(15), pp. 9619-9622 (1998). 

N. S. Wingreen, B. L. Altshuler, Y. Meir, 

Erratum: Comment on "2-Channel Kondo Scaling in Conductance Signals from 2-Level Tunneling 
Systems", 

Physical Review Letters 81:(19), pp. 4280 (1998). 

Naama Barkai, Mark D. Rose, Ned S. Wingreen, 
Protease Helps Yeast Find Mating Partners, 
Nature 396:(6710), pp. 422-423 (1998). 

V. Madhavan, W. Chen, T. Jamneala, M.F. Crommie, Ned S. Wingreen, 

Tunneling into a Single Magnetic Atom: Spectroscopic Evidence of the Kondo Resonance, 

Science 280: pp. 567-569 (1998). 

Hao Li, Chao Tang, Ned S. Wingreen, 
Are Protein Folds Atypical? 

Proceedings of the National Academy of Science 95: pp. 4987-4990 (1998). 

L. P. Kouwenhoven, C. M. Marcus, P. L. McEuen, S. Tarucha, R. M. Westervelt, N. S. Wingreen, 
Electron Transport in Quantum Dots, 

Proceedings of the NATO Advanced Study Institute on Mesoscopic Electron Transport edited by L.L. Sohn, L.P. 
Kouwenhoven, and G. Schon (Kluwer Series E345) pp. 105-204 (1997). 

L. Aleiner, Ned S. Wingreen, Yigal Meir, 

Dephasing and the Orthogonality Catastrophe in Tunneling Through a Quantum Dot: The "Which 
Path?" Interferometer, 

Physical Review Letters 79: pp. 3740-3743 (1997). 
Hao Li, Chao Tang, Ned S. Wingreen, 

Nature of Driving Force for Protein Folding: A Result from Analyzing the Statistical Potential, 

Physical Review Letters 79: pp. 765-768 (1997). 

Ned S. Wingreen, Charles A. Stafford, 

Quantum-Dot Cascade Laser: Proposal for an Ultralow-Threshold Semiconductor Laser, 

IEEE Journal of Quantum Electronics 33: pp. 1170-1173 (1997).

Oded Agam, Ned S. Wingreen, Boris Altshuler, D. C. Ralph, M. Tinkham, 

Chaos, Interactions, and Nonequilibrium Effects in the Tunneling Resonance Spectra of Ultrasmall 
Metallic Particles, 

Physical Review Letters 78: pp. 1956-1959 (1997). 

A. Yacoby, H.L. Stormer, Ned S. Wingreen, L. N. Pfeiffer, K. W. Baldwin, K. W. West, 
Nonuniversal Conductance Quantization in Quantum Wires, 

Physical Review Letters 77: pp. 4612-4615 (1996).

N.F. Schwabe, R.J. Elliott, Ned S. Wingreen,

The Ruderman-Kittel-Kasuya-Yosida (RKKY) Interaction Across a Tunneling Junction Out of 
Equilibrium, 

Physical Review B 54: pp. 12953-12968 (1996). 






Noam Sivan, Ned S. Wingreen, 

The Single Impurity Anderson Model Out of Equilibrium, 

Physical Review B 54: pp. 11622-11629 (1996).

Hao Li, Robert Helling, Chao Tang, Ned S. Wingreen, 

Emergence of Preferred Structures in a Simple Model of Protein Folding 

Science 273: pp. 666-669 (1996). 

C. A. Stafford, Ned S. Wingreen, 

Resonant Photon-assisted Tunneling Through a Double Quantum Dot: An Electron Pump from Spatial 
Rabi Oscillations, 

Physical Review Letters 76: pp. 1916-1919 (1996). 

Ned S. Wingreen, Eugen Schenfeld, 

Size-speed Trade-off in Optical Switching Elements, 

Applied Optics 34: pp. 5907-5912 (1995). 

Ned S. Wingreen, Boris Altshuler, Yigal Meir, 

Comment on "2-Channel Kondo Scaling in Conductance Signals from 2-Level Tunneling Systems," 

Physical Review Letters 75: pp. 769 (1995). 

Yigal Meir, Ned S. Wingreen, 

Spin-orbit Scattering and the Kondo Effect, 

Physical Review B (Rapid Communications) 50: pp. 4947-4950 (1994). 
Antti-Pekka Jauho, Ned S. Wingreen, Yigal Meir, 

Time-dependent Transport in Interacting and Noninteracting Resonant-tunneling Systems, 
Physical Review B 50: pp. 5528-5544 (1994). 

Antti-Pekka Jauho, Ned S. Wingreen, Yigal Meir, 

Time-dependent Transport in Mesoscopic Systems: General Formalism and Applications, 

Semiconductor Science and Technology 9: pp. 926-929 (1994). 

Ned S. Wingreen, Yigal Meir, 

Anderson Model out of Equilibrium: Noncrossing-approximation Approach to Transport Through a 
Quantum Dot, 

Physical Review B 49: pp. 11040-11052 (1994).
Mark Lee, Ned S. Wingreen, S. A. Solin, P. A. Wolff, 

Giant Growth Axis Longitudinal Magnetoresistance from In-plane Conduction in Semiconductor 
Superlattices, 

Solid State Communications 89: pp. 687-691 (1994). 

A. Alan Middleton, Ned S. Wingreen, 

Collective Transport in Arrays of Quantum Dots, 

Physical Review Letters 71: pp. 3198-3201(1993). 

Jari M. Kinaret, Ned S. Wingreen, 

Coulomb Blockade and Partially Transparent Tunneling Barriers in the Quantum Hall Regime, 
Physical Review B 48: pp. 11113-11119 (1993).

Ned S. Wingreen, Antti-Pekka Jauho, Yigal Meir, 
Time-dependent Transport Through a Mesoscopic Structure, 

Physical Review B (Rapid Communications) 48: pp. 8487-8490 (1993). 






P. L. McEuen, Ned S. Wingreen, E. B. Foxman, Jari Kinaret, U. Meirav, M. A. Kastner, Yigal Meir, 
Coulomb Interactions and Energy-level Spectrum of a Small Electron Gas, 
Physica B 189: pp. 70-79 (1993). 

E. B. Foxman, P. L. McEuen, U. Meirav, Ned S. Wingreen, Yigal Meir, Paul A. Belk, N. R. Belk, M. A. 
Kastner, S. J. Wind, 

Effects of Quantum Levels on Transport Through a Coulomb Island, 

Physical Review B (Rapid Communications) 47: pp. 10020-10023 (1993). 

Yigal Meir, Ned S. Wingreen, Patrick A. Lee, 

Low-temperature Transport Through a Quantum Dot: The Anderson Model out of Equilibrium, 

Physical Review Letters 70: pp. 2601-2604 (1993). 

Jari M. Kinaret, Yigal Meir, Ned S. Wingreen, Patrick Lee, Xiao-Gang Wen, 
Conductance Through a Quantum Dot in the Fractional Quantum Hall Regime, 

Physical Review B (Rapid Communications) 45: pp. 9489-9492 (1992). 

Jari M. Kinaret, Yigal Meir, Ned S. Wingreen, Patrick Lee, Xiao-Gang Wen, 

Many-body Coherence Effects in Conduction Through a Quantum Dot in the Fractional Quantum Hall 
Regime, 

Physical Review B 46: pp. 4681-4689 (1992). 
Yigal Meir, Ned S. Wingreen, 

Landauer Formula for the Current Through an Interacting Electron Region,

Physical Review Letters 68: pp. 2512-2515 (1992). 


P. L. McEuen, E. B. Foxman, Jari Kinaret, U. Meirav, M. A. Kastner, Ned S. Wingreen, S. J. Wind, 
Self-consistent Addition Spectrum of a Coulomb Island in the Quantum Hall Regime, 

Physical Review B (Rapid Communications) 45: pp. 11419-11422 (1992).

P. L. McEuen, E.B. Foxman, U. Meirav, M.A. Kastner, Yigal Meir, Ned S. Wingreen, 
Transport Spectroscopy of a Coulomb Island in the Quantum Hall Regime, 

Physical Review Letters 66: pp. 1926-1929 (1991). 

Yigal Meir, Ned S. Wingreen, Ora Entin-Wohlman, Boris L. Altshuler, 

Spin-Orbit Scattering for Localized Electrons: Absence of Negative Magnetoconductance, 

Physical Review Letters 66: pp. 1517-1520 (1991). 

Yigal Meir, Ned S. Wingreen, Patrick A. Lee, 

Transport Through a Strongly Interacting Electron System: Theory of Periodic Conductance 
Oscillations, 

Physical Review Letters 66: pp. 3048-3051 (1991). 
Ned S. Wingreen, 

Rectification by Resonant Tunneling Diodes, 

Applied Physics Letters 56: pp. 253-255 (1990). 

Ned S. Wingreen, Karsten W. Jacobsen, John W. Wilkins, 
Inelastic Scattering in Resonant Tunneling, 
Physical Review B 40: pp. 11834-11850 (1989).






Ned S. Wingreen, Monique Combescot, 

Electron-electron Scattering: Collision Integral and Relaxation Rate, 

Physical Review B 40: pp. 3191-3196 (1989). 

Ned S. Wingreen, Monique Combescot, 

Ohm's Law for Hot Carriers: the Role of Carrier-carrier Scattering at High Fields, 

Solid State Communications 70: pp. 185-189 (1989). 

Ned S. Wingreen, Karsten W. Jacobsen, John W. Wilkins, 

Resonant Tunneling with Electron-Phonon Interaction: An Exactly Solvable Model, 

Physical Review Letters 61: pp. 1396-1399 (1988). 

Ned S. Wingreen, Chris J. Stanton, John W. Wilkins, 

Electron-electron Scattering in Nondegenerate Semiconductors: Driving the Anisotropic Distribution 
Toward a Displaced Maxwellian, 

Physical Review Letters 57: pp. 1084-1087 (1986). 






PROTEINS: Structure, Function, and Genetics 47:506-512 (2002) 



Emergence of Highly Designable Protein-Backbone 
Conformations in an Off-Lattice Model 

Jonathan Miller, Chen Zeng, Ned S. Wingreen, and Chao Tang* 

NEC Research Institute, Princeton, New Jersey 



ABSTRACT Despite the variety of protein sizes, shapes, and backbone configurations found in
nature, the design of novel protein folds remains an open problem. Within simple lattice models
it has been shown that all structures are not equally suitable for design. Rather, certain
structures are distinguished by unusually high designability: the number of amino acid sequences
for which they represent the unique lowest energy state; sequences associated with such
structures possess both robustness to mutation and thermodynamic stability. Here we report that
highly designable backbone conformations also emerge in a realistic off-lattice model. The
highly designable conformations of a chain of 23 amino acids are identified and found to be
remarkably insensitive to model parameters. Although some of these conformations correspond
closely to known natural protein folds, such as the zinc finger and the helix-turn-helix motifs,
others do not resemble known folds and may be candidates for novel fold design.
Proteins 2002;47:506-512.
© 2002 Wiley-Liss, Inc.

Key words: protein folds; off-lattice model; designability; protein design; evolution

INTRODUCTION 

The de novo design of proteins, an object of enormous activity in recent years, has so far dealt
primarily with the redesign of known protein folds [1-8]. Two major accomplishments in the
direction of designing a fold that is distinct from known natural folds are the synthesis of a
right-handed coiled coil [9] and the synthesis of a zinc finger without zinc [10-12]. To
challenge the best efforts of de novo design, nature offers roughly 1000 qualitatively distinct
protein folds [13]. Why has it proven difficult to design new protein folds? What program should
we follow to achieve ab initio design of novel folds?

The principle of designability [14-19] offers an answer to both these questions for simple
lattice models. The designability of a structure is measured by the number of sequences that
design it, that is, the number of sequences that have the given structure as their unique lowest
energy conformation. Structures can differ vastly in their designability [14], and it has been
shown that high designability entails other protein-like properties, such as mutational
stability, thermodynamic stability [14,15], and fast folding kinetics [16,20]. Design is hard in
the sense that most structures have low designability and their associated sequences lack these
protein-like properties. For successful de novo design, one should first identify the few highly
designable structures.

It is an open question whether designability applies to 
real proteins as it does to lattice polymers. Real protein 
structures have a degree of complexity that cannot be 
effectively represented within a simple lattice model. For 
example, on a lattice the angles between bonds differ from 
those naturally adopted in real proteins. In addition, 
although in a cubic-lattice model the cube minimizes 
surface area for a given volume and is perfectly packed, no 
counterpart of the perfect cube exists once the lattice is 
removed. For designability to guide practical design of new 
folds it must apply to realistic descriptions of protein 
structure. 

In this article we report the computation of designability within an off-lattice model that
incorporates angles favored by natural proteins, for protein chains of up to N = 23 amino acids.
We find that the essential qualitative features of designability survive the transition from
lattice model to off-lattice model. In particular, it remains true that a small fraction of
compact structures are highly designable: these are nondegenerate ground states for an enormous
number of amino acid sequences. Most structures, on the other hand, are ground states for few,
if any, amino acid sequences. Furthermore, the sequences that fold into highly designable
structures typically have enhanced thermodynamic stability: the energy of the nearest excited
state is separated from the ground-state energy by an appreciable gap.
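
The notion of a stability gap can be made concrete with a small Python sketch. Given the energies of one sequence on every candidate structure, it returns the ground-state structure and the energy separation to the next-lowest structure. The function name `stability_gap` and the example energies are hypothetical, and the sketch makes no assumption about how the energies themselves are computed.

```python
def stability_gap(energies):
    """Return (ground_index, gap) for one sequence.

    energies: energy of the sequence on each candidate structure.
    gap is the difference between the second-lowest and lowest energies;
    it is zero when the ground state is degenerate.
    """
    if len(energies) < 2:
        raise ValueError("need at least two structures to define a gap")
    order = sorted(range(len(energies)), key=energies.__getitem__)
    ground, first_excited = order[0], order[1]
    return ground, energies[first_excited] - energies[ground]

# Hypothetical energies of a single sequence on three structures.
ground, gap = stability_gap([-3.2, -5.0, -4.1])
```

A sequence whose gap is large relative to the thermal energy scale would be expected to occupy its ground-state structure more robustly than one whose gap is close to zero.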

MODELS AND METHODS 

The model we adopt is closely related to the off-lattice, m-state discrete-angle model
introduced by Park and Levitt [21]. Each configuration is defined by a sequence of $C_\alpha$
bonds of length 3.8 Å, and each pair of dihedral angles $(\phi, \psi)$ is restricted to one of
only m alternatives; here we take m = 3. The set of m allowed angle pairs is chosen by fitting
to the backbone coordinates of representative natural proteins [21], as discussed below. To
suppress self-intersections of the chain, we augment the model by introducing a



C. Zeng's present address is Department of Physics, George Washington University,
Washington, DC 20052.

*Correspondence to: Chao Tang, NEC Research Institute, 4 Independence Way, Princeton,
NJ 08540. E-mail: tang@research.nj.nec.com

Received 23 October 2001; Accepted 7 January 2002 



© 2002 WILEY-LISS, INC.



PROTEIN-BACKBONE CONFORMATIONS 



507 












Fig. 1. a-c: Backbone configurations of 1st, 4th, and 15th most designable 23-mer structures.
d: Backbone configuration of the zinc finger 1PSV [12], truncated to 23 amino acids.



volume for the amino acid residues in the form of a sphere of radius $r_\beta$ centered on
$C_\beta$ (the first carbon of the side-chain). The backbones of some configurations constructed
in this fashion are shown in Fig. 1(a-c).
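
The excluded-volume test can be sketched in Python as a pairwise distance check: two spheres of equal radius intersect when their centers are closer than twice that radius. The function name `has_overlap` and the example coordinates are hypothetical, the radius of 1.9 Å is the value quoted in the accompanying declaration, and the sketch checks every pair of residues because the text does not say whether neighboring residues are treated differently.

```python
from itertools import combinations
from math import dist

def has_overlap(cbeta_coords, radius):
    """True if any two spheres of the given radius, centered on the listed points, intersect."""
    return any(dist(p, q) < 2 * radius for p, q in combinations(cbeta_coords, 2))

# Hypothetical sphere centers in angstroms.
clashing = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
clear = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0)]

rejected = has_overlap(clashing, radius=1.9)  # centers 3.0 apart, closer than 3.8
accepted = not has_overlap(clear, radius=1.9)  # nearest centers are 4.0 apart
```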

This off-lattice model incorporates properties of real 
polymers not well reproduced in simple lattice models. On 
the lattice, for example, allowed ground-state structures 
were limited to those maximally compact structures that 
fill the unique rectangle or box of minimum surface area. 
Off the lattice, every structure can be expected to have a 
distinct surface area. However, open or extended struc- 
tures are not expected to be designable. We entertain as 
plausible ground-state structures only those with a surface
area below some cutoff value A_c, which enters our
computation as a parameter.*

Because a discrete angle set represents only a crude 
approximation to a continuum of angles, it is unrealistic to 
expect the surface area of a discrete-angle structure to 
faithfully reproduce the surface area of a structure built 
from more flexible angles. Importantly, using flexible 
angles would allow our more open structures (e.g., those
just below the cutoff A_c) to contract and reduce their
exposed surface areas. To achieve this equalizing effect of
a continuum of angles within the limitations of a discrete-angle
model, we normalize the vector of solvent-accessible
surface areas A = (a_1, ..., a_N), where a_i is the solvent-accessible
surface area of the i-th residue, in such a way as
to preserve the pattern of surface exposure along a chain.



A suitable procedure† is to normalize the vector A for each
structure by the total exposed surface area of that structure:
$\tilde{A} = A / \sum_i a_i = (\tilde{a}_1, \ldots, \tilde{a}_N)$. This procedure treats all
structures below the cutoff A_c as equally compact while
preserving each structure's individual pattern of surface
exposure along the chain.
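
The cutoff and normalization steps described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `normalize_exposures`, the array layout (one row of per-residue areas per configuration), and the toy numbers are hypothetical and are not taken from the calculation reported here.

```python
import numpy as np

def normalize_exposures(areas, area_cutoff):
    """Apply the surface-area cutoff, then normalize each exposure vector.

    areas       : (n_config, N) array; areas[c, i] is the solvent-accessible
                  area of residue i in configuration c (hypothetical layout).
    area_cutoff : cutoff A_c on the total exposed area of a configuration.

    Returns the normalized vectors (each row sums to 1) and the indices
    of the configurations that passed the cutoff.
    """
    areas = np.asarray(areas, dtype=float)
    totals = areas.sum(axis=1)                 # total exposed area per configuration
    keep = np.flatnonzero(totals < area_cutoff)
    normalized = areas[keep] / totals[keep, None]
    return normalized, keep

# Toy example: the second configuration is too open and is discarded.
toy_areas = [[10.0, 30.0, 60.0],
             [50.0, 50.0, 100.0]]
a_tilde, kept = normalize_exposures(toy_areas, area_cutoff=150.0)
```

After this step every retained configuration has unit total exposure, so only the pattern of exposure along the chain distinguishes one configuration from another.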

As with real proteins, description and comparison of 
configurations off-lattice demands precision about what 
we mean by the term "structure." For example, a protein 
structure obtained by NMR represents an ensemble of 
configurations, no element of which necessarily provides a 
better fit to the data than any other. This ensemble 
presumably reproduces the temperature-induced fluctua- 
tions of a natural protein around its native state. On 
averaging over this ensemble for small stably folded 
polypeptides in the PDB database, one finds a typical 
center-of-mass root mean square (crms) of roughly 0.3-0.5
Å per residue. A similar range of crms can be inferred from
the B values of protein crystals. 23 Accordingly, our off-lattice
polymer configurations are grouped into clusters
consisting of all configurations lying within a crms distance
λ per residue of one another. Configurations within a
cluster are to be thought of as variations of a single 
structure, and subsequently we will refer to clusters and 
structures interchangeably. 

We define the designability of a structure as the sum of 
the designabilities of its included configurations. The 
designability of a configuration is simply the number of 
sequences with that configuration as a unique ground 
state. 14,15 To evaluate the energy of a sequence on each 
configuration, we associate a hydrophobicity h_i with each
amino acid of the sequence. In practice, we assign a
hydrophobicity which is either 0 (Polar) or 1 (Hydrophobic)
to each monomer to create an HP-sequence 24 ; that this is a
reasonable simplification finds support in the work of
Beasley and Hecht 1 [cf. Fig. 3(e) for the results of a more
general choice]. The energy of a particular sequence folded
into a particular configuration is obtained by taking the
sum of the products of each amino acid's hydrophobicity h_i
with its normalized surface exposure ã_i:

$$E = \sum_{i=1}^{N} h_i \tilde{a}_i \qquad (1)$$

We numerically evaluate the energy of all HP-sequences 
for all configurations. 
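
As a rough illustration of this enumeration, the sketch below evaluates Eq. (1) for every HP-sequence on a small set of configurations and counts, for each configuration, the sequences that have it as a unique ground state. The function name `hp_designability`, the tie tolerance `tol`, and the three-residue toy exposures are assumptions made for the example; a pure-Python loop of this kind is practical only for short chains, not for the full 23-mer enumeration.

```python
import itertools
import numpy as np

def hp_designability(exposure, tol=1e-12):
    """Count HP-sequences whose unique ground state is each configuration.

    exposure : (n_config, N) array of normalized exposures (rows sum to 1),
               with n_config >= 2.
    tol      : energies closer than this are treated as degenerate
               (an assumption of this sketch).
    """
    exposure = np.asarray(exposure, dtype=float)
    n_config, n_res = exposure.shape
    counts = np.zeros(n_config, dtype=int)
    for bits in itertools.product((0, 1), repeat=n_res):
        h = np.array(bits, dtype=float)
        energies = exposure @ h                          # Eq. (1) on every configuration
        lowest, second = np.partition(energies, 1)[:2]   # two lowest energies
        if second - lowest > tol:                        # skip degenerate ground states
            counts[int(np.argmin(energies))] += 1
    return counts

# Toy example with two configurations of a three-residue chain.
toy_exposure = np.array([[0.5, 0.5, 0.0],
                         [0.2, 0.2, 0.6]])
toy_counts = hp_designability(toy_exposure)
```

In the toy case the all-polar and all-hydrophobic sequences are degenerate (every configuration has energy 0 or 1, respectively) and so contribute to neither count.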

Except as indicated explicitly in the text, we choose 
discrete angles and the amino acid radius to optimize the 
fit to the backbone of the zinc-less synthetic zinc finger 12 
1PSV [Fig. 1(d)] . We find that there are many angle sets 
that fit the backbone of 1PSV almost equally well. For 
example, the crms per residue between 1PSV and the 
structure obtained from each of our 10 best angle sets 
varies from 0.844 to 0.913 Å. The angle set we use for most



*We evaluate the area of each C_β sphere accessible to a probe sphere
of radius 1.4 Å by the methods used in the program SERF; the
slightly different values of surface area obtained by different methods
do not in any way alter the outcome of the calculations.



†We have checked that certain alternative normalizations (e.g.,
normalizing by the total solvent-inaccessible surface area) do not alter 
the set of highly designable structures that emerge from our calcula- 
tion. With no normalization, higher designability becomes closely 
correlated with lower solvent-accessible surface area. 



508 



J. MILLER ET AL. 



Fig. 2. Histogram of designabilities of 23-mer structures, using r_β =
1.9 Å. The surface area cutoff A_c is such that 10,000 configurations
participate in the calculation, grouped into 4688 clusters with cluster
radius λ = 0.4 Å.



of the calculations presented in this article is (φ, ψ) =
(-95°, 135°), (-75°, -25°), and (-55°, -55°). The first pair
lies in the β-region of the Ramachandran plot, and the
other two pairs lie in the α-region. We take r_β = 1.9 Å, the
radius above which the amino acids fit to the backbone of
1PSV would clash.

RESULTS 

The designability of a structure denotes the number of 
distinct HP-sequences having that structure as their unique 
ground state. The distribution of designabilities for our 
model, displayed in Figure 2, reproduces a crucial feature 
first observed on the lattice: although most structures 
have very low designability, the trailing edge (or tail) of 
the distribution consists of a small number of structures of 
very high designability. Thus, designability distinguishes 
a small subset of structures from generic ones. 

It turns out that the identities of these highly designable 
structures depend only weakly on the values of the parameters
that enter our calculation: the surface area cutoff A_c,
clustering radius λ, side-chain radius r_β, the set of allowed
dihedral angles, and the range of amino acid hydrophobicities.
More specifically, a significant fraction of structures
identified as highly designable for one set of parameter 
values remains highly designable when these parameters 
are varied. We provide evidence for this important observa- 
tion in the next five subsections. 

Surface Area Cutoff 

As discussed before, open structures are expected to 
exhibit low designability. We anticipate that the highly 
designable structures of interest to us will fall mainly 
within the class of compact structures; therefore, only 
these compact structures are needed in our calculation. 
The surface area cutoff A_c determines how compact a
structure must be to qualify. We expect that, provided the
choice of A_c is not too restrictive, its particular value ought
not to be important.



A computationally practical choice of the surface-area 
cutoff eliminates most of the less compact configurations. 
A few of these might have proven highly designable if 
retained; however, our objective is not to find all highly 
designable structures, but only to identify some of them. 
Therefore, our major concern is not that we might incor- 
rectly discard a few designable structures, but rather that 
we might produce false positives (structures that appear to 
be highly designable with a restrictive value of the cutoff 
but have low designability for a more relaxed cutoff). A 
larger cutoff admits previously disallowed configurations 
that "steal" some sequences from a configuration originally 
identified as highly designable, thereby reducing its design- 
ability. 

In practice, as shown in Figure 3(a), highly designable 
structures tend to remain highly designable with increas- 
ing surface-area cutoff. For example, 9 of the 10 most 
designable structures remain within the 100 most design- 
able even after the surface-area cutoff is relaxed suffi- 
ciently to admit a 10-fold increase in the number of 
participating structures. 
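
The retention statistic quoted above (for example, 9 of the 10 most designable structures remaining within the 100 most designable) can be computed as in the following sketch. It assumes that structures carry identifiers that stay comparable when a parameter is changed; the function name `fraction_retained` and the toy rankings are hypothetical.

```python
def fraction_retained(ranked_before, ranked_after, k, top=100):
    """Fraction of the k most designable structures that stay in the top `top`.

    ranked_before, ranked_after : sequences of structure identifiers, sorted
        from most to least designable, before and after a parameter change.
    """
    top_after = set(ranked_after[:top])
    leaders = list(ranked_before[:k])
    return sum(s in top_after for s in leaders) / len(leaders)

# Toy example: 'c' drops out of the top three after the parameter change.
before = ["a", "b", "c", "d"]
after = ["b", "x", "a", "y", "c", "d"]
retained = fraction_retained(before, after, k=3, top=3)
```

Plotting this fraction against the varied parameter, for several values of k, gives curves of the kind summarized in the Figure 3 caption.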

Clustering Radius 

As discussed in the previous section, structures whose 
backbones differ insignificantly from one another ought 
not to be considered distinct. This observation is embodied 
in our calculation by grouping into clusters those structures
whose backbone configurations lie within a certain
crms distance, λ, of one another. Varying the clustering
radius, λ, leaves unchanged the set of configurations that
participate in the calculation. For λ ≤ 0.1 Å, nearly every
cluster consists of a unique configuration. To exhibit the
dependence of the most designable structures on λ, we fix a
configuration and follow the designability of the cluster to
which that configuration belongs, as a function of λ. As
shown in Figure 3(b), the most designable structures
remain roughly the same as λ is varied over a wide range.
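
One possible way to realize such a clustering is sketched below. It rests on several assumptions that the text does not fix: crms is computed after optimal rigid superposition (the Kabsch procedure), clusters are formed by single linkage (any two configurations within λ are merged), and the names `crms`, `cluster_by_crms`, and `cluster_designability` are hypothetical. The last helper sums configuration designabilities over each cluster, following the definition of structure designability given earlier.

```python
import numpy as np

def crms(P, Q):
    """RMS deviation between two (N, 3) coordinate sets after optimal superposition."""
    P = np.asarray(P, dtype=float) - np.mean(P, axis=0)   # remove translation
    Q = np.asarray(Q, dtype=float) - np.mean(Q, axis=0)
    U, S, Vt = np.linalg.svd(P.T @ Q)
    if np.linalg.det(U @ Vt) < 0:                          # exclude improper rotations
        S[-1] = -S[-1]
    msd = (np.sum(P**2) + np.sum(Q**2) - 2.0 * S.sum()) / len(P)
    return float(np.sqrt(max(msd, 0.0)))

def cluster_by_crms(configs, lam):
    """Single-linkage clustering: merge configurations within crms distance lam."""
    parent = list(range(len(configs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]                  # path compression
            i = parent[i]
        return i

    for i in range(len(configs)):
        for j in range(i + 1, len(configs)):
            if crms(configs[i], configs[j]) <= lam:
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(configs))]

def cluster_designability(labels, config_designability):
    """Sum the designabilities of the configurations in each cluster."""
    totals = {}
    for label, d in zip(labels, config_designability):
        totals[label] = totals.get(label, 0) + d
    return totals
```

The pairwise loop scales quadratically with the number of configurations, so for sets of many thousands of configurations a neighbor search or pre-filtering step would likely be needed.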

Side-Chain Radius 

Excluded volume is incorporated by means of a hard
sphere of radius r_β centered on the β-carbon of each amino
acid. Increasing the side-chain radius r_β eliminates some
configurations because of steric clashes, whereas decreasing
r_β admits previously ineligible configurations. Starting
at r_β = 1.9 Å, we identify the most designable structures
and then count the fraction of these structures that remain
highly designable as r_β is reduced. As shown in Figure 3(c),
the identities of the most designable structures are well
preserved.

Set of Dihedral Angles 

Next, we address to what extent an outcome depends on 
a particular choice of the discrete set of dihedral angles. A 
discrete set of angles cannot sample the structure space 
fully and so cannot "hit" all possible structures. On the 
other hand, we know that the designability of a structure 
depends on the local density of solvent-exposure vectors A 
with highly designable structures occupying the lowest 
density regions. 15 If the subset of structures sampled by a 
discrete set of angles reasonably preserves density in the 






[Figure 3, panels a-e: plots not reproduced. Recoverable axis labels: number of structures (panel a); r_β (Å), from 1.90 to 1.80 (panel c); number of angle sets (panel d); N_s (×10^4) from HP enumeration (panel e). Legend entries: 10/100, 20/100, 40/100, 60/100.]

Fig. 3. Sensitivity to parameter changes of the most designable structures from Figure 2. a: Fraction of the
10, 20, 40, or 60 most designable structures that remain in the 100 most designable as the surface-area cutoff
increases. The initial cutoff A_c is chosen so that only the 1000 most compact configurations participate and A_c
increases until 10,000 configurations participate. b: Fraction of the 10, 20, 30, or 40 most designable structures
that remain in the 50 most designable as the clustering radius λ is increased. The 5000 most compact
configurations participate in the calculation and r_β = 1.9 Å. c: Fraction of the 10, 20, 40, or 60 most designable
structures that remain in the 100 most designable as the side-chain radius r_β is changed. We have chosen the
surface area cutoff so that 5000 structures participate in the designability calculation for r_β = 1.9 Å. If some
configurations of the original most designable structures are not among the 5000 most compact configurations
for some smaller r_β, we nevertheless retain them in the calculation. The clustering radius is λ = 0.4 Å. d:
Fraction of the 10, 40, 70, or 100 most designable structures that remain in the 100 most designable as
configurations from other angle sets are added. The values of the five angle sets are as follows: set #1 = (-95°,
135°), (-75°, -25°), (-55°, -55°); set #2 = (-95°, 135°), (-85°, -55°), (-65°, -25°); set #3 = (-105°,
145°), (-85°, -15°), (-75°, -35°); set #4 = (-105°, 145°), (-85°, -35°), (-85°, -5°); set #5 = (-105°,
145°), (-85°, -35°), (-85°, -15°). e: Designability of structures obtained from 4,000,000 randomly generated
sequences of real numbers in [0,1] versus designability from enumeration of HP-sequences. The 10,000 most
compact configurations participate in the calculation, λ = 0.4 Å, and r_β = 1.9 Å. (Note the suppressed zeros in
panels a, b, and d.)



space of structures, highly designable structures should 
remain highly designable as we improve our sampling of 
structure space. 

To examine this possibility, we identify configurations 
generated by one angle set and follow their cluster design- 
abilities as configurations from other angle sets are added. 
We take five different angle sets derived from fitting to 
1PSV, and use the most compact configurations generated 
by each set. We calculate the designability of structures by 



using configurations from, respectively, one, two, three, 
four, and finally all five sets. We observe in Figure 3(d) 
that the most designable structures in set #1 remain 
highly designable even as configurations from sets #2, #3, 
#4, and #5 are added. This result is maintained under 
permutation of the five sets. Apparently, any reasonable 
choice of angle set covers the structure space sufficiently 
well that highly designable structures can be identified 
with high probability. 






[Figure 4: plot not reproduced. Horizontal axis: N_s, logarithmic scale.]

Fig. 4. Maximum energy gap (red dots) and average energy gap
(black dots) for the HP-sequences that design a given structure, plotted
versus structure designability. The 10,000 most compact configurations of
the 23-mer participate in the calculation, with λ = 0.4 Å and r_β = 1.9 Å.



HP Sequences 

To check whether the identification of designable struc- 
tures depends on our use of HP (binary) sequences of 
amino acids, we recalculate designabilities by using amino 
acids with continuous real-valued hydrophobicities. We
randomly choose 4,000,000 sequences h = (h_1, ..., h_N),
where h_i ∈ [0,1], and evaluate their energy for all configurations
using Eq. (1). In Figure 3(e) we plot the designability
calculated this way against that from the enumeration
of HP sequences. As the figure shows, the highly design- 
able structures computed by these two alternative meth- 
ods are nearly identical. 
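
A minimal sketch of this sampling estimate follows, assuming hydrophobicities drawn independently and uniformly from [0,1] and ignoring exact energy ties, which have vanishing probability for continuous values. The function name `sampled_designability`, the seed, and the toy exposures are hypothetical.

```python
import numpy as np

def sampled_designability(exposure, n_samples, seed=0):
    """Estimate designability from random real-valued hydrophobicity sequences.

    exposure  : (n_config, N) array of normalized exposures.
    n_samples : number of random sequences h with each h_i in [0, 1).
    Returns, for each configuration, how many sampled sequences have it
    as their lowest-energy configuration under Eq. (1).
    """
    exposure = np.asarray(exposure, dtype=float)
    rng = np.random.default_rng(seed)
    h = rng.random((n_samples, exposure.shape[1]))   # one sequence per row
    energies = h @ exposure.T                        # (n_samples, n_config)
    ground = energies.argmin(axis=1)                 # ground-state index per sequence
    return np.bincount(ground, minlength=exposure.shape[0])

# Toy example reusing two three-residue exposure patterns.
toy_exposure = np.array([[0.5, 0.5, 0.0],
                         [0.2, 0.2, 0.6]])
toy_estimate = sampled_designability(toy_exposure, n_samples=2000, seed=1)
```

For many sequences and many configurations the energy matrix should be built in batches of sequences to keep memory use bounded.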

Parameter Independence 

In the preceding five subsections we have shown that the 
parameters can sustain a considerable degree of variation 
without significantly changing the outcome of the design- 
ability calculation. The weak dependence of the set of 
highly designable structures on parameters is illustrated 
in Figure 3. Because the identity of the highly designable 
structures is robust to parameter variation, we now exam- 
ine their potential as candidates for design. 

Gap 

In particular, a prerequisite for design is believed to be 
the presence of a large separation between the ground- 
state energy and the energy of the lowest excited state. For 
each structure, we have identified the HP-sequence that 
makes this gap the largest. The value of this largest gap is 
shown in Figure 4, as a function of the designability of the 
structure. To convert the vertical scale of Figure 4 to real 
energies, we observe that one unit of energy corresponds to 
a sequence of exclusively hydrophobic amino acids (h_i = 1)
folded into one of our typical compact structures. Our
choice of surface area cutoff A_c guarantees that a typical
compact configuration has around half of its maximal
accessible surface exposed (about 25 Å² per residue). A
conservative estimate for the energy of exposed surface, 23
20 cal/Å²/mol, then yields an energy on the order of 10
kcal/mol for a 23-mer. The highest gap energies achieved
in Figure 4, of order 0.05, therefore correspond to a gap of
0.5 kcal/mol, around k_B T for room temperature. This gap is
roughly the energy to promote one hydrophobic amino acid 
from core to surface. Also plotted is the average gap for all 
HP-sequences that design a structure. It is evident that 
high designability correlates strongly with a large gap. 
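
The unit conversion in this paragraph can be checked with a short calculation. The inputs below are the figures quoted in the text; the room temperature of 298 K and the variable names are assumptions of this sketch.

```python
# Figures quoted in the text
N_RESIDUES = 23
EXPOSED_AREA_PER_RESIDUE = 25.0   # square angstroms per residue
SURFACE_ENERGY = 20.0             # cal per square angstrom per mol
GAP_MODEL_UNITS = 0.05            # largest gaps read from Figure 4

# Assumed constants (not stated in the text)
GAS_CONSTANT = 1.987e-3           # kcal / (mol K)
ROOM_TEMPERATURE = 298.0          # K

# One model energy unit: a fully hydrophobic 23-mer on a typical compact structure
unit_kcal = N_RESIDUES * EXPOSED_AREA_PER_RESIDUE * SURFACE_ENERGY / 1000.0

gap_kcal = GAP_MODEL_UNITS * unit_kcal          # largest gap in kcal/mol
kT_kcal = GAS_CONSTANT * ROOM_TEMPERATURE       # thermal energy in kcal/mol

print(f"energy unit: {unit_kcal:.1f} kcal/mol")
print(f"largest gap: {gap_kcal:.2f} kcal/mol")
print(f"k_B T at room temperature: {kT_kcal:.2f} kcal/mol")
```

With these inputs the energy unit comes to 11.5 kcal/mol and the largest gap to about 0.58 kcal/mol, consistent with the order-of-magnitude values of 10 kcal/mol and 0.5 kcal/mol given above, and close to the thermal energy of about 0.59 kcal/mol.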

DISCUSSION AND CONCLUSION 

The principle of designability is that some structures are 
intrinsically easier to design than others. However, up to 
now, designability has been shown only in highly restric- 
tive lattice models. Our calculations indicate that the 
qualitative features of designability in lattice models are 
also exhibited off-lattice. Namely, a small minority of 
off-lattice structures are distinguished by high designabil- 
ity: these structures are lowest-energy states for many 
more than their share of sequences. Moreover, the se- 
quences associated with these structures have enhanced 
thermodynamic stability. The work presented here, using 
an off-lattice model for protein-backbone configurations, 
makes it more plausible that designability applies to real 
proteins. Of course, the model used in the current study is 
highly simplified: it is a low-resolution discrete model of
a short chain with a very simple potential function. There is
still a long way to go to show the designability principle in 
real proteins. 

Nonetheless, the insensitivity to model parameters of 
the results presented suggests that our highly designable 
structures are possible candidates for real protein design. 
It is therefore worthwhile to study some of our best 
candidates in detail and to understand what architectural 
properties distinguish the most designable structures from 
the least designable ones and how the most designable 
ones compare with known natural structures. 

Representative configurations of some of the most designable
structures are shown in Figure 1(a-c). A striking
characteristic of the highly designable structures is that 
each has a well-defined core consisting of a small subset of 
the amino acids of the chain. For example, in Figure 5 we 
have plotted the inaccessible surface area of each amino 
acid along the chain for the configuration appearing in 
Figure 1(b). Observe that 5 of the 23 amino acids are more 
than 70% buried. Also shown in Figure 5 is the probability 
that a hydrophobic amino acid occupies a particular site, 
averaged over all HP-sequences that design the structure, 
revealing the preference of hydrophobic amino acids for 
the core. 

A quantitative measure of the core in a structure is the
variance v_s of the exposure vector Ã:

$$v_s = \frac{1}{N} \sum_i \tilde{a}_i^2 - \frac{1}{N^2} \Big( \sum_i \tilde{a}_i \Big)^2$$

In Figure 6, we plot v_s versus the designability
N_s. On average the two quantities correlate well;
however, the scatter of the data is large in the region of low






[Figure 5: plot not reproduced. Horizontal axis: residue site, 0 to 24.]

Fig. 5. Solid bars: Inaccessible surface for residues (C_β spheres) of
the highly designable configuration shown in Figure 1(b). Hollow bars:
Probability, averaged over all HP-sequences that design the configuration,
that a particular site along the chain is occupied by a hydrophobic
amino acid.




[Figure 6: plot not reproduced. Horizontal axis: N_s (×10^4).]

Fig. 6. The average variance v_s of a cluster against the designability
N_s of the cluster for the 23-mer. The 5000 most compact configurations
participate in the calculation, λ = 0.4 Å, and r_β = 1.9 Å. Gray line: running
average with bin size 30.

N_s: structures with well-formed cores are not necessarily
highly designable.
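
The variance of the exposure vector is straightforward to evaluate. The sketch below uses a hypothetical function name `exposure_variance` and two made-up four-residue exposure patterns to show how a well-defined core raises the variance relative to a uniformly exposed chain.

```python
import numpy as np

def exposure_variance(a_tilde):
    """Variance of a normalized exposure vector: mean of squares minus squared mean."""
    a = np.asarray(a_tilde, dtype=float)
    n = a.size
    return float(np.sum(a**2) / n - (np.sum(a) / n) ** 2)

# Uniform exposure (no core) versus two fully buried residues (a clear core).
uniform = [0.25, 0.25, 0.25, 0.25]
with_core = [0.5, 0.5, 0.0, 0.0]

v_uniform = exposure_variance(uniform)
v_core = exposure_variance(with_core)
```

Because the normalized exposures sum to 1, the second term is always 1/N², so differences in this variance between structures of the same length come entirely from the sum of squared exposures.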

A zinc finger-like fold emerges from our calculation as 
one of the most designable structures. The fold [Fig. 1(b)]
does not simply replicate 1PSV [Fig. 1(d)], on which we
optimized our angle set. The structure of 1PSV is too open
to be designable within our model because the small,
uniformly sized side-chains cannot fill the large opening
between the α-helix and the β-β turn in 1PSV. It is of
interest that the model produces a highly designable
solution by collapsing the α-helix onto the turn.

Another of our most designable structures is similar to 
another small natural fold, the helix-turn-helix [see Fig. 
1(c)]. Some of our most designable structures [e.g., that 
shown in Fig. 1(a)] do not resemble any known natural 




Fig. 7. a: Backbone configuration of the 11th most designable 23-mer
structure, using untargeted angle set (see text): (φ, ψ) = (-55°, 135°),
(-126°, 145°), and (-85°, -25°), with a mean crms of 3.6 Å on a
representative subset of natural structures segmented into subchains of
21 amino acids. For this calculation, the amino acids are represented by
spheres of radius r_α = 1.52 Å centered on the C_α carbons only. b:
Backbone configuration of the zinc finger 1NC8, truncated to 23 amino
acids. 25



folds. These structures are candidates for the design of 
truly novel folds. 

Targeting a fold by fitting the angle set to a single chosen 
structure is not essential. For example, we can obtain a 
suitable angle set by choosing two pairs of dihedral angles
(φ, ψ) within the β-sheet region and one pair from the
α-helix region, locally optimizing on 160 representative
natural structures from the PDB database. 21 Among the 
most designable structures emerging for this angle set is 
the zinc finger-like structure in Figure 7(a), shown next to 
its apparent natural counterpart, 1NC8 [Fig. 7(b)]. 25 

Recently, many studies have been conducted on the 
relation between the folding kinetics and the topology of 
native states. 26-36 In particular, it has been shown that
folding rates and the topology of the transition states are
closely related to the topology of the native states. In other
words, the native state topology, which in this context is
often measured in terms of contact order, 26,35 largely
determines how a protein folds. It would be interesting to 
compare the two roles the native state topology plays: in 
folding kinetics and in the designability and thermody- 
namic stability. However, such a comparative study would 
preferably be done in systems of longer chains than used in 
the current study. Although it is tempting to think that 
there is a deep connection between the two roles of 
topology, one should note that there is a huge variation in 
folding rates among natural proteins, 33 which are presum- 
ably highly designable and thermodynamically stable. It 
appears that designability is largely governed by the 
surface-core patterning, 15 whereas folding kinetics de- 
pends more on the ease of forming native contacts (the 
contact order). 

In summary, we have computed the designabilities of 
structures within an off-lattice model of realistic protein- 
backbone configurations. Highly designable structures 
emerge with remarkable insensitivity to model parame- 
ters. The sequences that design these structures have 
strongly enhanced mutational stability and a large energy 
gap between the native fold and the lowest non-native 
conformation. In this light, it is interesting that recent 
mutation studies on some small proteins show that they 
maintain their native folds even when about half of their 
residues are replaced by alanine. 37,38 Some of our highly 






designable structures correspond closely to natural folds, 
such as the zinc finger and helix-turn-helix motifs. Others 
do not resemble existing structures and are candidates for 
ab initio design of novel protein folds. 

REFERENCES 

1. Beasley JR, Hecht MH. Protein design: the choice of de novo 
sequences. J Biol Chem 1997;272:2031-2034. 

2. Baltzer L. Functionalization of designed folded polypeptides. Curr 
Opin Struct Biol 1998;8:466-470. 

3. Cao AN, Lai LH, Tang YQ. The current state and prospect of de
novo protein design. Prog Biochem Biophys 1998;25:197-201.

4. Giver L, Arnold FH. Combinatorial protein design by in vitro 
recombination. Curr Opin Chem Biol 1998;2:335-338. 

5. Regan L, Wells J. Engineering and design: recent adventures in 
molecular design — editorial overview. Curr Opin Struct Biol 
1998;8:441-442. 

6. Schafmeister CE, Stroud RM. Helical protein design. Curr Opin 
Biotechnol 1998;9:350-353. 

7. Shakhnovich EI. Protein design: a perspective from simple trac- 
table models. Fold Design 1998;3:R45-R58. 

8. DeGrado WF, Summa CM, Pavone V, Nastri F, Lombardi A. De
novo design and structural characterization of proteins and
metalloproteins. Annu Rev Biochem 1999;68:779-819.

9. Harbury PB, Plecs JJ, Tidor B, Alber T, Kim PS. High-resolution 
protein design with backbone freedom. Science 1998;282:1462- 
1467. 

10. Struthers MD, Cheng RP, Imperiali B. Design of a monomeric
23-residue polypeptide with defined tertiary structure. Science
1996;271:342-345.

11. Dahiyat BI, Mayo SL. De novo protein design: fully automated 
sequence selection. Science 1997;278:82-87. 

12. Dahiyat BI, Sarisky CA, Mayo SL. De novo protein design:
towards fully automated sequence selection. J Mol Biol
1997;273:789-796.

13. Chothia C. One thousand families for the molecular biologist. 
Nature 1992;357:543-544. 

14. Li H, Helling R, Tang C, Wingreen N. Emergence of preferred 
structures in a simple model of protein folding. Science 1996;273: 
666-669. 

15. Li H, Tang C, Wingreen NS. Are protein folds atypical? Proc Natl 
Acad Sci USA 1998;95:4987-4990. 

16. Govindarajan S, Goldstein RA. Searching for foldable protein 
structures using optimized energy functions. Biopolymers 1995;36: 
43-51. 

17. Govindarajan S, Goldstein RA. Why are some protein structures 
so common? Proc Natl Acad Sci USA 1996;93:3341-3345. 

18. Finkelstein AV, Ptitsyn OB. Why do globular proteins fit the 
limited set of folding patterns? Prog Biophys Mol Biol 1987;50:171- 
190. 

19. Yue K, Dill KA. Forces of tertiary structural organization in 
globular proteins. Proc Natl Acad Sci USA 1995;92:146-150. 



20. Mélin R, Li H, Wingreen NS, Tang C. Designability, thermodynamic
stability, and dynamics in protein folding: a lattice model
study. J Chem Phys 1999;110:1252-1262.

21. Park BH, Levitt M. The complexity and accuracy of discrete state 
models of protein structure. J Mol Biol 1995;249:493-507. 

22. Flower DR. SERF: A program for accessible surface area calcula- 
tions. J Mol Graph Model 1997;15:238-244. 

23. Creighton TE. Proteins. New York: Freeman; 1993. p 160-162,
236-237.

24. Lau KF, Dill KA. Lattice statistical mechanics model of the
conformational and sequence spaces of proteins. Macromolecules
1989;22:3986-3997.

25. Kodera Y, Sato K, Tsukahara T, Komatsu H, Maeda T, Kohno T. 
High-resolution solution NMR structure of the minimal active 
domain of the human immunodeficiency virus type-2 nucleocapsid 
protein. Biochemistry 1998;37:17704-17713. 

26. Plaxco KW, Simons KT, Baker D. Contact order, transition state 
placement and the refolding rates of single domain proteins. J Mol 
Biol 1998;277:985-994. 

27. Chan HS. Protein folding: matching speed and locality. Nature 
1998;392:761-763. 

28. Portman JJ, Takada S, Wolynes PG. Variational theory for site 
resolved protein folding free energy surfaces. Phys Rev Lett 
1998;81:5237-5240. 

29. Goldenberg DP. Finding the right fold. Nat Struct Biol 1999;6:987- 
990. 

30. Alm E, Baker D. Matching theory and experiment in protein
folding. Curr Opin Struct Biol 1999;9:189-196.

31. Fersht AR. Transition-state structure as a unifying basis in 
protein-folding mechanisms: contact order, chain topology, stabil- 
ity, and the extended nucleus mechanism. Proc Natl Acad Sci USA 
2000;97:1525-1529. 

32. Maritan A, Micheletti C, Banavar JR. Role of secondary motifs in
fast folding polymers: a dynamical variational principle. Phys Rev
Lett 2000;84:3009-3012.

33. Baker D. A surprising simplicity to protein folding. Nature 
2000;405:39-42. 

34. Clementi C, Nymeyer H, Onuchic JN. Topological and energetic
factors: what determines the structural details of the transition
state ensemble and "en-route" intermediates for protein folding?
An investigation for small globular proteins. J Mol Biol
2000;298:937-953.

35. Plaxco KW, Simons KT, Ruczinski I, Baker D. Sequence, stability, 
topology and length; the determinants of two-state protein folding 
kinetics. Biochemistry 2000;39:11177-11183. 

36. Guerois R, Serrano L. Protein design based on folding models.
Curr Opin Struct Biol 2001;11:101-106.

37. Kuroda Y, Kim PS. Folding of bovine pancreatic trypsin inhibitor 
(BPTI) variants in which almost half the residues are alanine. J 
Mol Biol 2000;298:493-501. 

38. Brown BM, Sauer RT. Tolerance of Arc repressor to multiple- 
alanine substitutions. Proc Natl Acad Sci USA 1999;96:1983- 
1988. 



